Linear regression - part 2

Many variables


In [8]:
# imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import seaborn as sns
from IPython.display import (
    display, 
    Math, 
    Latex
)
%matplotlib inline

For many variables we will use vectorized implementation $$X=\left[\begin{array}{cc} 1 & (\vec x^{(1)})^T \\ 1 & (\vec x^{(2)})^T \\ \vdots & \vdots\\ 1 & (\vec x^{(m)})^T \\ \end{array}\right] = \left[\begin{array}{cccc} 1 & x_1^{(1)} & \cdots & x_n^{(1)} \\ 1 & x_1^{(2)} & \cdots & x_n^{(2)} \\ \vdots & \vdots & \ddots & \vdots\\ 1 & x_1^{(m)} & \cdots & x_n^{(m)} \\ \end{array}\right] $$

$$\vec{y} = \left[\begin{array}{c} y^{(1)}\\ y^{(2)}\\ \vdots\\ y^{(m)}\\ \end{array}\right] \quad \theta = \left[\begin{array}{c} \theta_0\\ \theta_1\\ \vdots\\ \theta_n\\ \end{array}\right]$$

The vectorized implementation is much faster than the one from the previous lecture.
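
To see the difference, here is a minimal sketch (not from the lecture) that times a pure-Python loop against the vectorized cost formula on synthetic data; the names and sizes below are arbitrary:

In [ ]:
# Sketch: compare a Python loop with the vectorized cost on synthetic data.
# All names and sizes here are illustrative, not part of the exercise data.
import timeit

m_demo, n_demo = 20000, 5
X_demo = np.hstack((np.ones((m_demo, 1)), np.random.rand(m_demo, n_demo)))
y_demo = np.random.rand(m_demo)
theta_demo = np.zeros(n_demo + 1)

def J_loop(theta, X, y):
    # cost computed example by example with a Python loop
    total = 0.0
    for i in range(len(y)):
        total += (X[i].dot(theta) - y[i]) ** 2
    return total / (2.0 * len(y))

def J_vec(theta, X, y):
    # the same cost as a single matrix expression
    r = X.dot(theta) - y
    return r.dot(r) / (2.0 * len(y))

print(timeit.timeit(lambda: J_loop(theta_demo, X_demo, y_demo), number=10))
print(timeit.timeit(lambda: J_vec(theta_demo, X_demo, y_demo), number=10))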


In [14]:
df = pd.read_csv("ex1data1.txt", header=None)
df.columns = ['x', 'y']
X = np.matrix(df.x.values[:, np.newaxis])
# adding theta_0
m = len(X)
X = np.concatenate((np.ones((m, 1)), X), axis=1)
y = np.matrix(df.y.values[:, np.newaxis])
theta = np.matrix([-5, 1.3]).reshape(2, 1)

In [15]:
print('X', X[:10])
print('y', y[:10])
print('theta', theta)


X [[ 1.      6.1101]
 [ 1.      5.5277]
 [ 1.      8.5186]
 [ 1.      7.0032]
 [ 1.      5.8598]
 [ 1.      8.3829]
 [ 1.      7.4764]
 [ 1.      8.5781]
 [ 1.      6.4862]
 [ 1.      5.0546]]
y [[ 17.592 ]
 [  9.1302]
 [ 13.662 ]
 [ 11.854 ]
 [  6.8233]
 [ 11.886 ]
 [  4.3483]
 [ 12.    ]
 [  6.5987]
 [  3.8166]]
theta [[-5. ]
 [ 1.3]]

Cost function

$$J(\theta)=\dfrac{1}{2|\vec y|}\left(X\theta-\vec{y}\right)^T\left(X\theta-\vec{y}\right)$$


In [3]:
def JMx(theta, X, y):
    m = len(y)
    J = 1.0/(2.0*m)*((X*theta-y).T*(X*theta-y))
    return J.item()

In [16]:
error = JMx(theta, X, y) 
display(Math(r'\Large J(\theta) = %.4f' % error))


$$\Large J(\theta) = 4.5885$$

How do we compute the derivatives?

Let's compute the gradient: $$\nabla J(\theta) = \frac{1}{|\vec y|} X^T\left(X\theta-\vec y\right)$$

Gradient Descent (vectorized)

$$ \theta = \theta - \alpha \nabla J(\theta) $$

Assignment 1.

Implement the vectorized GD algorithm.


In [ ]:
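
A minimal sketch of one possible vectorized gradient descent loop, assuming the X, y, and JMx defined above; alpha and the iteration count are illustrative values:

In [ ]:
# Sketch, not the official solution: vectorized gradient descent.
def gradient_descent(X, y, theta, alpha=0.01, iters=1500):
    m = len(y)
    costs = []                                    # J(theta) after each update
    for _ in range(iters):
        grad = (1.0 / m) * X.T * (X * theta - y)  # nabla J(theta)
        theta = theta - alpha * grad              # simultaneous update of all theta_j
        costs.append(JMx(theta, X, y))
    return theta, costs

theta_gd, costs = gradient_descent(X, y, np.matrix(np.zeros((2, 1))))
print(theta_gd)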

Normal matrix method

We can compute $\hat\theta$ using this equation: $$\theta = (X^TX)^{-1}X^T \vec y$$

Assignment 2.

Implement the normal matrix method and check whether the resulting $\theta$ vector matches the one from the GD method. Use pinv to compute the inverse matrix; it is more numerically stable.


In [ ]:
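
A minimal sketch of the normal matrix method, using np.linalg.pinv as suggested and assuming the X, y, and theta_gd from the sketches above:

In [ ]:
# Sketch, not the official solution: closed-form estimate via the normal equation.
theta_ne = np.linalg.pinv(X.T * X) * X.T * y
print(theta_ne)
# With enough iterations, the theta from gradient descent should be close to theta_ne.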

| Gradient Descent | Normal matrix method |
| --- | --- |
| need to choose $\alpha$ | no need to choose $\alpha$ |
| needs many iterations | no iterations |
| works for a large number of features (x) | slow for a large number of features (x) |
| no matrix inversion needed | we need to compute the inverse matrix |

Assignment 3.

Use the scikit-learn library (normal matrix and gradient methods) to fit the model and predict y for x = 1, 10, 100.


In [17]:
from sklearn.linear_model import (
    LinearRegression,
    SGDRegressor
)

In [ ]:
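
A minimal sketch of how the two scikit-learn estimators could be used here. scikit-learn fits the intercept itself, so the raw feature column from df is passed rather than the X with the ones column; the SGDRegressor settings below are illustrative:

In [ ]:
# Sketch, not the official solution.
x_raw = df.x.values.reshape(-1, 1)   # single feature column, no bias column
y_raw = df.y.values

lr = LinearRegression()              # least-squares (normal-equation style) fit
lr.fit(x_raw, y_raw)

sgd = SGDRegressor(max_iter=1000, tol=1e-3)  # gradient-based fit; illustrative settings
sgd.fit(x_raw, y_raw)

x_new = np.array([[1.0], [10.0], [100.0]])
print(lr.predict(x_new))
print(sgd.predict(x_new))
# For SGD, feature scaling is usually recommended before fitting.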